# How to Speed Up EEGUnity with Built-in Parallelism

## 1. Introduction

EEGUnity supports parallel execution through:

- Global worker count: `UnifiedDataset(..., num_workers=...)`
- Batch backend selection: `batch_process(..., execution_mode=...)`

This tutorial explains when to use thread mode, process mode, or sequential mode.

## 2. Basic Configuration

```python
from eegunity import UnifiedDataset

ud = UnifiedDataset(
    dataset_path="your_dataset_root",
    domain_tag="your_domain_tag",
    num_workers=8,
)
```

- `num_workers=0` means sequential execution.
- `num_workers>0` enables parallel paths where supported.

## 3. Execution Modes in `batch_process`

`EEGBatch.batch_process` supports:

- `execution_mode='thread'`: recommended for I/O-bound tasks
- `execution_mode='process'`: recommended for CPU-bound tasks
- `execution_mode=None`: force sequential execution

Example:

```python
results = ud.eeg_batch.batch_process(
    con_func=lambda row: row["Completeness Check"] == "Completed",
    app_func=lambda row: row["File Path"],
    is_patch=False,
    result_type="value",
    execution_mode="thread",
)
```

## 4. Where Parallelism Is Commonly Applied

### 4.1 Parser Stage

When loading from `dataset_path`, parser steps can use `num_workers`, including:

- standard file scanning
- MAT/CSV parsing
- HDF5 EEGLAB `.set` fallback parsing
- BrainVision `.vhdr` sidecar fallback parsing
- WFDB header parsing

### 4.2 Batch Stage

Many heavy `eeg_batch` methods now choose backend explicitly, for example:

- filtering/resampling/normalization pipelines
- quality scoring
- hash calculation
- epoch workflows

Some write-heavy paths stay sequential intentionally to keep output consistency.

## 5. Choosing Worker Count

General guidance:

- Start at `os.cpu_count()`
- Reduce if memory pressure is high
- Increase slightly for storage/network bound workloads
- Benchmark on your own dataset

```python
import os
from eegunity import UnifiedDataset

ud = UnifiedDataset(
    dataset_path="your_dataset_root",
    domain_tag="your_domain_tag",
    num_workers=os.cpu_count(),
)
```

## 6. Practical Tips

- Use `execution_mode='process'` for CPU-heavy transforms.
- Use `execution_mode='thread'` for file I/O aggregation.
- Set `num_workers=0` during debugging for deterministic reproduction.
- Avoid nesting your own thread/process pools around EEGUnity batch calls.

## 7. Summary

Use `num_workers` to define available parallel workers and `execution_mode` to choose the right backend for each workload.